A new and efficient neural-network and finite-difference hybrid method is developed for solving Poisson equation in a regular domain with jump discontinuities on embedded irregular interfaces. Since the solution has low regularity across the interface, when applying finite difference discretization to this problem, an additional treatment accounting for the jump discontinuities must be employed. Here, we aim to elevate such an extra effort to ease our implementation by machine learning methodology. The key idea is to decompose the solution into singular and regular parts. The neural network learning machinery incorporating the given jump conditions finds the singular solution, while the standard finite difference method is used to obtain the regular solution with associated boundary conditions. Regardless of the interface geometry, these two tasks only require supervised learning for function approximation and a fast direct solver for Poisson equation, making the hybrid method easy to implement and efficient. The two- and three-dimensional numerical results show that the present hybrid method preserves second-order accuracy for the solution and its derivatives, and it is comparable with the traditional immersed interface method in the literature. As an application, we solve the Stokes equations with singular forces to demonstrate the robustness of the present method.
translated by 谷歌翻译
在本文中,开发了用于求解具有delta功能奇异源的椭圆方程的浅丽兹型神经网络。目前的工作中有三个新颖的功能。即,(i)Delta函数奇异性自然删除,(ii)级别集合函数作为功能输入引入,(iii)它完全浅,仅包含一个隐藏层。我们首先介绍问题的能量功能,然后转换奇异源对沿界面的常规表面积分的贡献。以这种方式,可以自然删除三角洲函数,而无需引入传统正规化方法(例如众所周知的沉浸式边界方法)中常用的函数。然后将最初的问题重新重新审议为最小化问题。我们提出了一个带有一个隐藏层的浅丽兹型神经网络,以近似能量功能的全局最小化器。结果,通过最大程度地减少能源的离散版本的损耗函数来训练网络。此外,我们将界面的级别设置函数作为网络的功能输入,并发现它可以显着提高训练效率和准确性。我们执行一系列数值测试,以显示本方法的准确性及其在不规则域和较高维度中问题的能力。
translated by 谷歌翻译
在本文中,开发了一种新的不连续性捕获浅神经网络(DCSNN),以近似于$ d $ d $二维的分段连续功能和解决椭圆界面问题。当前网络中有三个新颖的功能。即,(i)跳跃不连续性被准确捕获,(ii)它完全浅,仅包含一个隐藏层,(iii)它完全无网格,用于求解部分微分方程。这里的关键想法是,可以将$ d $维的分段连续函数扩展到$(d+1)$ - 尺寸空间中定义的连续函数,其中增强坐标变量标记每个子域的零件。然后,我们构建一个浅神经网络来表达这一新功能。由于仅使用一个隐藏层,因此训练参数(权重和偏见)的数量与隐藏层中使用的维度和神经元线性缩放。为了解决椭圆界面问题,通过最大程度地减少由管理方程式,边界条件和接口跳跃条件组成的均方误差损失来训练网络。我们执行一系列数值测试以证明本网络的准确性。我们的DCSNN模型由于仅需要训练的参数数量中等(在所有数值示例中使用了几百个参数),因此很有效,结果表明准确性良好。与传统的基于网格的浸入界面方法(IIM)获得的结果相比,该方法专门针对椭圆界面问题而设计,我们的网络模型比IIM表现出更好的精度。我们通过解决一个六维问题来结论,以证明本网络在高维应用中的能力。
translated by 谷歌翻译
With the fast development of big data, it has been easier than before to learn the optimal decision rule by updating the decision rule recursively and making online decisions. We study the online statistical inference of model parameters in a contextual bandit framework of sequential decision-making. We propose a general framework for online and adaptive data collection environment that can update decision rules via weighted stochastic gradient descent. We allow different weighting schemes of the stochastic gradient and establish the asymptotic normality of the parameter estimator. Our proposed estimator significantly improves the asymptotic efficiency over the previous averaged SGD approach via inverse probability weights. We also conduct an optimality analysis on the weights in a linear regression setting. We provide a Bahadur representation of the proposed estimator and show that the remainder term in the Bahadur representation entails a slower convergence rate compared to classical SGD due to the adaptive data collection.
translated by 谷歌翻译
Model counting is a fundamental problem which has been influential in many applications, from artificial intelligence to formal verification. Due to the intrinsic hardness of model counting, approximate techniques have been developed to solve real-world instances of model counting. This paper designs a new anytime approach called PartialKC for approximate model counting. The idea is a form of partial knowledge compilation to provide an unbiased estimate of the model count which can converge to the exact count. Our empirical analysis demonstrates that PartialKC achieves significant scalability and accuracy over prior state-of-the-art approximate counters, including satss and STS. Interestingly, the empirical results show that PartialKC reaches convergence for many instances and therefore provides exact model counting performance comparable to state-of-the-art exact counters.
translated by 谷歌翻译
Robots are traditionally bounded by a fixed embodiment during their operational lifetime, which limits their ability to adapt to their surroundings. Co-optimizing control and morphology of a robot, however, is often inefficient due to the complex interplay between the controller and morphology. In this paper, we propose a learning-based control method that can inherently take morphology into consideration such that once the control policy is trained in the simulator, it can be easily deployed to robots with different embodiments in the real world. In particular, we present the Embodiment-aware Transformer (EAT), an architecture that casts this control problem as conditional sequence modeling. EAT outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired robot embodiment, past states, and actions, our EAT model can generate future actions that best fit the current robot embodiment. Experimental results show that EAT can outperform all other alternatives in embodiment-varying tasks, and succeed in an example of real-world evolution tasks: stepping down a stair through updating the morphology alone. We hope that EAT will inspire a new push toward real-world evolution across many domains, where algorithms like EAT can blaze a trail by bridging the field of evolutionary robotics and big data sequence modeling.
translated by 谷歌翻译
Persuasion modeling is a key building block for conversational agents. Existing works in this direction are limited to analyzing textual dialogue corpus. We argue that visual signals also play an important role in understanding human persuasive behaviors. In this paper, we introduce the first multimodal dataset for modeling persuasion behaviors. Our dataset includes 199 dialogue transcriptions and videos captured in a multi-player social deduction game setting, 26,647 utterance level annotations of persuasion strategy, and game level annotations of deduction game outcomes. We provide extensive experiments to show how dialogue context and visual signals benefit persuasion strategy prediction. We also explore the generalization ability of language models for persuasion modeling and the role of persuasion strategies in predicting social deduction game outcomes. Our dataset, code, and models can be found at https://persuasion-deductiongame.socialai-data.org.
translated by 谷歌翻译
Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer). Despite considerable progress, the capacity and scalability of traditional neural networks are still limited, which may hinder their applications in more complex environments. In contrast, the Transformer architecture has shown its superiority in a wide range of large-scale sequence modeling tasks, including natural language processing and decision-making problems. In this paper, we propose Terrain Transformer (TERT), a high-capacity Transformer model for quadrupedal locomotion control on various terrains. Furthermore, to better leverage Transformer in sim-to-real scenarios, we present a novel two-stage training framework consisting of an offline pretraining stage and an online correction stage, which can naturally integrate Transformer with privileged training. Extensive experiments in simulation demonstrate that TERT outperforms state-of-the-art baselines on different terrains in terms of return, energy consumption and control smoothness. In further real-world validation, TERT successfully traverses nine challenging terrains, including sand pit and stair down, which can not be accomplished by strong baselines.
translated by 谷歌翻译
Graphene quantum dots provide a platform for manipulating electron behaviors in two-dimensional (2D) Dirac materials. Most previous works were of the "forward" type in that the objective was to solve various confinement, transport and scattering problems with given structures that can be generated by, e.g., applying an external electrical field. There are applications such as cloaking or superscattering where the challenging problem of inverse design needs to be solved: finding a quantum-dot structure according to certain desired functional characteristics. A brute-force search of the system configuration based directly on the solutions of the Dirac equation is computational infeasible. We articulate a machine-learning approach to addressing the inverse-design problem where artificial neural networks subject to physical constraints are exploited to replace the rigorous Dirac equation solver. In particular, we focus on the problem of designing a quantum dot structure to generate both cloaking and superscattering in terms of the scattering efficiency as a function of the energy. We construct a physical loss function that enables accurate prediction of the scattering characteristics. We demonstrate that, in the regime of Klein tunneling, the scattering efficiency can be designed to vary over two orders of magnitudes, allowing any scattering curve to be generated from a proper combination of the gate potentials. Our physics-based machine-learning approach can be a powerful design tool for 2D Dirac material-based electronics.
translated by 谷歌翻译
The unsupervised anomaly localization task faces the challenge of missing anomaly sample training, detecting multiple types of anomalies, and dealing with the proportion of the area of multiple anomalies. A separate teacher-student feature imitation network structure and a multi-scale processing strategy combining an image and feature pyramid are proposed to solve these problems. A network module importance search method based on gradient descent optimization is proposed to simplify the network structure. The experimental results show that the proposed algorithm performs better than the feature modeling anomaly localization method on the real industrial product detection dataset in the same period. The multi-scale strategy can effectively improve the effect compared with the benchmark method.
translated by 谷歌翻译